Context-based visual feedback recognition

Author

  • Louis-Philippe Morency
Abstract

During face-to-face conversation, people use visual feedback (e.g., head and eye gestures) to communicate relevant information and to synchronize rhythm between participants. When recognizing visual feedback, people often rely on more than their visual perception. For instance, knowledge about the current topic and from previous utterances helps guide the recognition of nonverbal cues. The goal of this thesis is to augment computer interfaces with the ability to perceive visual feedback gestures and to exploit contextual information from the current interaction state to improve visual feedback recognition. We introduce the concept of visual feedback anticipation, in which contextual knowledge from an interactive system (e.g., the last utterance spoken by the robot or system events from the GUI) is analyzed online to anticipate visual feedback from a human participant and improve visual feedback recognition. Our multi-modal framework for context-based visual feedback recognition was successfully tested on conversational and non-embodied interfaces for head and eye gesture recognition. We also introduce the Frame-based Hidden-state Conditional Random Field (FHCRF) model, a new discriminative model for visual gesture recognition that can model the substructure of a gesture sequence, learn the dynamics between gesture labels, and be directly applied to label unsegmented sequences. The FHCRF model outperforms previous approaches (i.e., HMM, SVM, and CRF) for visual gesture recognition and can efficiently learn the contextual information necessary for visual feedback anticipation. A real-time visual feedback recognition library for interactive interfaces (called Watson) was developed to recognize head gaze, head gestures, and eye gaze using images from a monocular or stereo camera and context information from the interactive system. Watson was downloaded by more than 70 researchers around the world and was successfully used by MERL, USC, NTT, MIT Media Lab, and many other research groups.

Thesis Supervisor: Trevor Darrell
Title: Associate Professor

Similar resources

Dialogue Context for Visual Feedback Recognition

Head pose and gesture offer several key conversational grounding cues and are used extensively in face-to-face interaction among people. When recognizing visual feedback, people use more than their visual perception. Knowledge about the current topic and expectations from previous utterances help guide our visual perception in recognizing nonverbal cues. In this chapter, we investigate how dial...

Towards Context-Based Visual Feedback Recognition for Embodied Agents

Head pose and gesture offer several key conversational grounding cues and are used extensively in face-to-face interaction among people. We investigate how contextual information can improve visual recognition of feedback gestures during interactions with embodied conversational agents. We present a visual recognition model that integrates cues from the spoken dialogue of an embodied agent with...

Conditional Sequence Model for Context-Based Recognition of Gaze Aversion

Eye gaze and gesture form key conversational grounding cues that are used extensively in face-to-face interaction among people. To accurately recognize visual feedback during interaction, people often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. In this paper, we investigate how dialog context from an embodied conversational agen...

Head gestures for perceptual interfaces: The role of context in improving recognition

Head pose and gesture offer several conversational grounding cues and are used extensively in face-to-face interaction among people. To accurately recognize visual feedback, humans often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. In this paper we describe how contextual information can be used to predict visual feedback and imp...

Believable Visual Feedback in Motor Learning Using Occlusion-based Clipping in Video Mapping

Gait rehabilitation systems provide patients with guidance and feedback that assist them to better perform the rehabilitation tasks. Real-time feedback can guide users to correct their movements. Research has shown that the quality of feedback is crucial to enhance motor learning in physical rehabilitation. Common feedback systems based on virtual reality present interactive feedback in a monit...

Publication date: 2006